    CiteTracked: A Longitudinal Dataset of Peer Reviews and Citations

    Scientific dissemination is central to the scientific process. This paper presents CiteTracked, a dataset of peer reviews and citation statistics covering scientific papers from the machine learning community and spanning six years. We describe and analyze the collection of over 3,000 published papers, their peer review texts and citation counts, and outline possible uses of the data. The dataset aims to foster new interdisciplinary work across fields such as scientometrics, information retrieval, computational linguistics and natural language processing to study the scientific publishing process.